Version: 1.0

Kafka Connect

  • Key concerns when building data pipelines:
    • Timeliness
    • Reliability
  • Default REST port in distributed mode: 8083

Components

  • Connector

    • Defines how data is copied between Kafka and an external system
    • Copies the data by breaking the copy job into a set of tasks
    • Two types of connectors:
      • Source connector: reads data from an external system and writes it to Kafka topics
      • Sink connector: reads data from Kafka topics and writes it to an external system
    • Is responsible for three things:
      • How many tasks to run for the connector
      • How to split data-copying between tasks
      • Getting the task configurations from the workers and passing them along to the tasks
  • Tasks

    • Responsible for getting data in and out of Kafka
    • They are initialized by receiving a context from the worker (a source context or a sink context)
    • Task state is stored in Kafka, in the special topics named by config.storage.topic and status.storage.topic, and is managed by the associated connector
  • Workers

    • They are the container processes that execute connectors and tasks
    • Responsible for
      • Handling the HTTP requests that define connectors and their configurations
      • Storing connector and task configurations
      • Starting connectors and their tasks and passing the appropriate configurations along
      • Committing offsets for source and sink connectors
      • Handling retries when tasks fail
    • When a worker fails, its tasks are rebalanced across the remaining active workers. When a task fails, it is treated as an exceptional case and no rebalance is triggered
    • Two types:
      • Standalone workers: a single process is responsible for executing all connectors and tasks
      • Distributed workers: multiple worker processes started with the same group.id form a cluster and share the connectors and tasks
  • Converters: translate data between Kafka Connect's internal data format and the serialized bytes stored in Kafka

    • JSON converter: ships with Apache Kafka
    • Avro converter: provided by Confluent and used together with Schema Registry
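

The connector responsibilities listed above (how many tasks to run, and how to split the copying between them) can be illustrated with a small sketch. This is plain Python rather than the Connect Java API, where the equivalent hook is `Connector.taskConfigs(int maxTasks)`. The `tables` and `connection.url` keys and both function names are hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def group_partitions(elements: List[str], num_groups: int) -> List[List[str]]:
    """Split elements into num_groups contiguous groups whose sizes differ by at most one."""
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")
    base, extra = divmod(len(elements), num_groups)
    groups, start = [], 0
    for i in range(num_groups):
        # The first `extra` groups take one additional element.
        size = base + (1 if i < extra else 0)
        groups.append(elements[start:start + size])
        start += size
    return groups


def task_configs(connector_config: Dict[str, str], max_tasks: int) -> List[Dict[str, str]]:
    """Return one config dict per task, each holding its own share of the tables."""
    # "tables" is a hypothetical comma-separated setting naming the units of work.
    tables = [t.strip() for t in connector_config["tables"].split(",") if t.strip()]
    if not tables:
        return []
    # Never start more tasks than there are units of work.
    num_tasks = min(max_tasks, len(tables))
    configs = []
    for group in group_partitions(tables, num_tasks):
        cfg = dict(connector_config)  # shared settings are passed along unchanged
        cfg["tables"] = ",".join(group)
        configs.append(cfg)
    return configs
```

With three tables and a limit of two tasks, `task_configs({"connection.url": "jdbc:example", "tables": "orders,customers,items"}, 2)` yields one task for `orders,customers` and one for `items`. The worker, not the connector, then starts a task for each returned config.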

Internal topics

  • connect-configs
  • connect-offsets
  • connect-status

Tools

  • Connect version: curl http://localhost:8083/
  • Available connector plugins: curl http://localhost:8083/connector-plugins
  • All connectors: curl http://localhost:8083/connectors
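

The same three REST calls can be scripted. The sketch below uses only the Python standard library and assumes a worker is listening on http://localhost:8083/, as in the curl commands above. The `ENDPOINTS` table and the helper names `endpoint_url` and `fetch` are illustrative, not part of Kafka Connect.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumed worker address, taken from the curl commands above.
BASE_URL = "http://localhost:8083/"

# Short names mapped to the REST paths used in the curl commands.
ENDPOINTS = {
    "version": "",
    "plugins": "connector-plugins",
    "connectors": "connectors",
}


def endpoint_url(name: str, base_url: str = BASE_URL) -> str:
    """Build the full URL for one of the named endpoints."""
    return urljoin(base_url, ENDPOINTS[name])


def fetch(name: str, base_url: str = BASE_URL, timeout: float = 5.0):
    """GET the endpoint and decode its JSON body (needs a running worker)."""
    with urlopen(endpoint_url(name, base_url), timeout=timeout) as resp:
        return json.load(resp)


# Example, with a worker running locally:
#   print(fetch("connectors"))
```

`fetch` raises `urllib.error.URLError` when no worker is reachable, so a caller that polls the API would normally wrap it in a try/except.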